Skip to content

Conversation

jahnvi480
Copy link
Contributor

@jahnvi480 jahnvi480 commented Oct 7, 2025

Work Item / Issue Reference

AB#39059


Summary

This pull request introduces a new global setting, native_uuid, to the mssql_python package, allowing users to control whether UUID values are returned as uuid.UUID objects or as strings. The implementation includes updates to the package initialization, row processing logic, and a comprehensive set of tests to verify the new behavior and ensure backward compatibility.

UUID Handling Improvements:

  • Added a native_uuid setting to the global configuration in mssql_python/__init__.py, defaulting to False for backward compatibility. This setting controls whether UUIDs are returned as uuid.UUID objects or as strings.
  • Updated the Row class in mssql_python/row.py to check the native_uuid setting and convert UUID values to strings when native_uuid is False, ensuring consistent output based on configuration.

Testing Enhancements:

  • Updated and extended tests in tests/test_004_cursor.py to verify correct UUID handling for both native_uuid=True and native_uuid=False, including new tests for the setting and resetting of the native_uuid option.

Internal Refactoring:

  • Removed unused UUID mapping logic in mssql_python/cursor.py that is now handled via the new setting and row processing logic.
  • Minor import and code organization cleanups in affected modules for clarity and maintainability.

These changes provide greater flexibility and control over how UUIDs are handled in query results, improving the usability of the package in different application contexts.

@Copilot Copilot AI review requested due to automatic review settings October 7, 2025 06:25
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR introduces a new native_uuid global setting to control whether UUID values are returned as uuid.UUID objects or strings, with a default value of False for backward compatibility.

Key changes:

  • Added native_uuid setting to global configuration with backward-compatible default
  • Updated row processing logic to convert UUID objects to strings when native_uuid=False
  • Enhanced test coverage to verify UUID handling behavior for both setting states

Reviewed Changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

File Description
mssql_python/init.py Added native_uuid setting initialization and module-level access
mssql_python/row.py Implemented UUID conversion logic based on native_uuid setting
mssql_python/cursor.py Removed unused UUID mapping code
tests/test_004_cursor.py Added comprehensive tests for native_uuid setting and updated existing UUID tests

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

@github-actions github-actions bot added the pr-size: medium Moderate update size label Oct 7, 2025
self.lowercase = False
# Use the pre-determined separator - no locale access here
self.decimal_separator = _DEFAULT_DECIMAL_SEPARATOR
self.native_uuid = False # Default to False for backwards compatibility
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a boolean-only, locked setter for native_uuid (reject non-bool inputs)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made the required changes

Copy link
Contributor

@sumitmsft sumitmsft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a few comments

Copy link

📊 Code Coverage Report

🔥 Diff Coverage

63%


🎯 Overall Coverage

74%


📈 Total Lines Covered: 4214 out of 5668
📁 Project: mssql-python


Diff Coverage

Diff: main...HEAD, staged and unstaged changes

  • mssql_python/init.py (35.7%): Missing lines 178-179,182,185,188-189,191-192,194
  • mssql_python/cursor.py (100%)
  • mssql_python/row.py (62.9%): Missing lines 27-28,43,81,83-86,89-91,93,95-101,160,176,186-187

Summary

  • Total: 88 lines
  • Missing: 32 lines
  • Coverage: 63%

mssql_python/init.py

Lines 174-198

  174 
  175 def _custom_setattr(name, value):
  176     if name == 'lowercase':
  177         # Strict boolean type check for lowercase
! 178         if not isinstance(value, bool):
! 179             raise ValueError("lowercase must be a boolean value (True or False)")
  180         
  181         with _settings_lock:
! 182             _settings.lowercase = value
  183             # Update the module's lowercase variable
  184             _original_module_setattr(name, _settings.lowercase)
! 185     elif name == 'native_uuid':
  186 
  187         # Strict boolean type check for native_uuid
! 188         if not isinstance(value, bool):
! 189             raise ValueError("native_uuid must be a boolean value (True or False)")
  190         
! 191         with _settings_lock:
! 192             _settings.native_uuid = value
  193             # Update the module's native_uuid variable
! 194             _original_module_setattr(name, _settings.native_uuid)
  195     else:
  196         _original_module_setattr(name, value)
  197 
  198 # Replace the module's __setattr__ with our custom version

mssql_python/row.py

Lines 23-32

  23         # Use settings snapshot if provided, otherwise fallback to global settings
  24         if settings_snapshot is not None:
  25             self._settings = settings_snapshot
  26         else:
! 27             settings = get_settings()
! 28             self._settings = {
  29                 'lowercase': settings.lowercase,
  30                 'native_uuid': settings.native_uuid
  31             }
  32         # Create mapping of column names to indices first

Lines 39-47

  39                     if self._settings.get('lowercase'):
  40                         col_name = col_name.lower()
  41                     self._column_map[col_name] = i
  42         else:
! 43             self._column_map = column_map
  44         
  45         # First make a mutable copy of values
  46         processed_values = list(values)
  47         

Lines 77-105

   77                 for i in uuid_indices:
   78                     if i < len(processed_values) and processed_values[i] is not None:
   79                         value = processed_values[i]
   80                         if isinstance(value, str):
!  81                             try:
   82                                 # Remove braces if present
!  83                                 clean_value = value.strip('{}')
!  84                                 processed_values[i] = uuid.UUID(clean_value)
!  85                             except (ValueError, AttributeError):
!  86                                 pass  # Keep original if conversion fails
   87             # Fallback to scanning all columns if indices weren't pre-identified
   88             else:
!  89                 for i, value in enumerate(processed_values):
!  90                     if value is None:
!  91                         continue
   92                     
!  93                     if i < len(description) and description[i]:
   94                         # Check SQL type for UNIQUEIDENTIFIER (-11)
!  95                         sql_type = description[i][1]
!  96                         if sql_type == -11:  # SQL_GUID
!  97                             if isinstance(value, str):
!  98                                 try:
!  99                                     processed_values[i] = uuid.UUID(value.strip('{}'))
! 100                                 except (ValueError, AttributeError):
! 101                                     pass
  102         # When native_uuid is False, convert UUID objects to strings
  103         else:
  104             for i, value in enumerate(processed_values):
  105                 if isinstance(value, uuid.UUID):

Lines 156-164

  156                         converted_values[i] = converter(value)
  157                 except Exception as e:
  158                     # Log the exception for debugging without leaking sensitive data
  159                     if hasattr(self._cursor, 'log'):
! 160                         self._cursor.log('debug', f'Exception occurred in output converter: {type(e).__name__}', exc_info=True)
  161                     # If conversion fails, keep the original value
  162                     pass
  163         
  164         return converted_values

Lines 172-180

  172         Allow accessing by column name as attribute: row.column_name
  173         """
  174         # _column_map should already be set in __init__, but check to be safe
  175         if not hasattr(self, '_column_map'):
! 176             self._column_map = {}
  177             
  178         # Try direct lookup first
  179         if name in self._column_map:
  180             return self._values[self._column_map[name]]

Lines 182-191

  182         # Use the snapshot lowercase setting instead of global
  183         if self._settings.get('lowercase'):
  184             # If lowercase is enabled, try case-insensitive lookup
  185             name_lower = name.lower()
! 186             if name_lower in self._column_map:
! 187                 return self._values[self._column_map[name_lower]]
  188         
  189         raise AttributeError(f"Row has no attribute '{name}'")
  190     
  191     def __eq__(self, other):


📋 Files Needing Attention

📉 Files with overall lowest coverage (click to expand)
mssql_python.pybind.connection.connection.cpp: 68.3%
mssql_python.ddbc_bindings.py: 68.5%
mssql_python.pybind.ddbc_bindings.cpp: 69.3%
mssql_python.row.py: 77.8%
mssql_python.pybind.connection.connection_pool.cpp: 78.9%
mssql_python.connection.py: 81.7%
mssql_python.helpers.py: 84.7%
mssql_python.auth.py: 85.3%
mssql_python.__init__.py: 86.4%
mssql_python.type.py: 86.8%

🔗 Quick Links

⚙️ Build Summary 📋 Coverage Details

View Azure DevOps Build

Browse Full Coverage Report

# Identify UUID columns (SQL_GUID = -11)
self._uuid_indices = []
for i, desc in enumerate(self.description):
if desc and desc[1] == uuid.UUID: # Column type code at index 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The UUID column detection uses if desc and desc[1] == uuid.UUID, relying on description tuples storing Python types. Elsewhere UUID handling appears to test for the integer SQL_GUID (-11). This inconsistency will prevent reliable precomputation of self._uuid_indices, causing unnecessary per-row scanning for UUID conversion.

Can we standardize on either integer SQL type codes (e.g., define SQL_GUID = ddbc_sql_const.SQL_GUID.value and store that in description) or Python type objects and update all UUID detection logic accordingly. If choosing integer codes, line 1020 should become if desc and desc[1] == SQL_GUID:

# If describe fails, it's likely there are no results (e.g., for INSERT)
self.description = None

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lines 999–1007 perform the single metadata initialization after execute(). Did you remove a previously redundant second initialization here?

Please confirm (or document) that the driver always exposes complete column metadata immediately after DDBCSQLExecute for all result-producing statements (including stored procedures, catalog functions).
If there are edge cases where metadata becomes fully available only after the first fetch or SQLMoreResults, the removal could be a regression (UUID indices and type-dependent logic would be impacted).
If guarantees exist, consider adding a brief comment above this block stating that one describe pass is sufficient. Optionally add a lightweight invariant check (e.g., all tuples have length 7; GUID columns mapped as expected) to fail fast in unusual scenarios.

If you verify that this is satisfied, you can safely proceed to just make it explicit so future contributors don’t reintroduce a second describe call or wonder why there is only one.

converted_values[i] = converter(value_bytes)
elif isinstance(value, int):
# For integers, we'll convert to bytes
value_bytes = value.to_bytes(8, byteorder='little')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Integers are always converted to 8 little-endian bytes via value.to_bytes(8, byteorder='little'). This introduces multiple issues:

  1. If the value doesn’t fit in 8 bytes, to_bytes(8, ...) raises OverflowError. This is silently swallowed by the broad except block and the value is left unconverted which will be hard to diagnose.
  2. Wrong widths for SQL integer types:
    TINYINT → 1 byte
    SMALLINT → 2 bytes
    INTEGER → 4 bytes
    BIGINT → 8 bytes
    Padding everything to 8 bytes can break converters expecting the exact width.
  3. Negative values need signed=True; the current call omits it and may fail or misrepresent data.
  4. The except block logs and keeps the original value, hiding format incompatibilities and causing inconsistent behavior (some ints converted, others not).

Can you pls make a note of this problem and fix it?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-size: medium Moderate update size

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants